Discovering Predictive Association Rules
Authors
Abstract
Association rule algorithms can produce a very large number of output patterns. This has raised questions of whether the set of discovered rules "overfit" the data because all the patterns that satisfy some constraints are generated (the Bonferroni effect). In other words, the question is whether some of the rules are "false discoveries" that are not statistically significant. We present a novel approach for estimating the number of "false discoveries" at any cutoff level. Empirical evaluation shows that on typical datasets the fraction of rules that may be false discoveries is very small. A bonus of this work is that the statistical significance measures we compute are a good basis for ordering the rules for presentation to users, since they correspond to the statistical "surprise" of the rule. We also show how to compute confidence intervals for the support and confidence of an association rule, enabling the rule to be used predictively on future data.
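The abstract does not say how the confidence intervals are computed. As an illustration only, the sketch below treats the support and the confidence of a rule A => B as binomial proportions and applies the Wilson score interval, a standard textbook choice that is an assumption here and not necessarily the paper's method. The function names `wilson_interval` and `rule_intervals` and the transaction counts are hypothetical.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 is roughly 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p_hat = successes / trials
    denom = 1.0 + z * z / trials
    centre = (p_hat + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / trials + z * z / (4 * trials * trials)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, centre - half), min(1.0, centre + half)


def rule_intervals(n_total: int, n_antecedent: int, n_both: int, z: float = 1.96) -> dict:
    """Intervals for a rule A => B (assumed binomial model, not the paper's own).

    n_total      -- number of transactions
    n_antecedent -- transactions containing A
    n_both       -- transactions containing both A and B
    """
    return {
        # support = P(A and B), estimated over all transactions
        "support": wilson_interval(n_both, n_total, z),
        # confidence = P(B | A), estimated over transactions containing A
        "confidence": wilson_interval(n_both, n_antecedent, z),
    }


if __name__ == "__main__":
    # Hypothetical counts: 1000 transactions, 200 contain A, 150 contain A and B.
    print(rule_intervals(1000, 200, 150))
```

Because the confidence interval is estimated from only the transactions that contain the antecedent, it is wider than the support interval whenever the antecedent is rare, which matters when a rule is applied to future data.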
Similar resources
Using Soft-Matching Mined Rules to Improve Information Extraction
By discovering predictive relationships between different pieces of extracted data, data-mining algorithms can be used to improve the accuracy of information extraction. However, textual variation due to typos, abbreviations, and other sources can prevent the productive discovery and utilization of hard-matching rules. Recent methods for inducing soft-matching rules from extracted data can more ...
Extraction of Interesting Association Rules Using Genetic Algorithms
The process of discovering interesting and unexpected rules from large data sets is known as association rule mining. The typical approach is to make strong simplifying assumptions about the form of the rules, and limit the measure of rule quality to simple properties such as support or confidence. Support and confidence limit the level of interestingness of the generated rules. Comprehensibili...
Using Association Rules for Fraud Detection in Web Advertising Networks
Discovering associations between elements occurring in a stream is applicable in numerous applications, including predictive caching and fraud detection. These applications require a new model of association between pairs of elements in streams. We develop an algorithm, Streaming-Rules, to report association rules with tight guarantees on errors, using limited processing per element, and minima...
Apriori Multiple Algorithm for Mining Association Rules
One of the most important data mining problems is mining association rules. In this paper we consider discovering association rules from large transaction databases. The problem of discovering association rules can be decomposed into two sub-problems: finding large itemsets and generating association rules from large itemsets. The second sub-problem is the easier one, and the complexity of discovering as...
Cyclic Association Rules
We study the problem of discovering association rules that display regular cyclic variation over time. For example, if we compute association rules over monthly sales data, we may observe seasonal variation where certain rules are true at approximately the same month each year. Similarly, association rules can also display regular hourly, daily, weekly, etc., variation that is cyclical in natur...
On the Discovery of Interesting Patterns in Association Rules
Many decision support systems, which utilize association rules for discovering interesting patterns, require the discovery of association rules that vary over time. Such rules describe complicated temporal patterns such as events that occur on the “first working day of every month.” In this paper, we study the problem of discovering how association rules vary over time. In particular, we introd...